ASC & OZCOTS 2023
Monash University, Australia
Professor Dianne Cook, Department of Econometrics and Business Statistics, Monash University, Melbourne, Australia
Dr. Emi Tanaka, Biological Data Science Institute, Australian National University, Canberra, Australia
Assistant Professor Susan VanderPlas, Statistics Department, University of Nebraska, Lincoln, USA
Graphical approaches (plots) are the recommended methods for diagnosing residuals.
Residual plots are usually revealing when the assumptions are violated.
Formal tests and graphical procedures are complementary and both have a place in residual analysis, but graphical methods are easier to use.
Residual plots are more informative in most practical situations than the corresponding conventional hypothesis tests.
What do you observe from this residual plot?
However, this is an over-interpretation.
The fitted model is correctly specified!
The triangle shape is caused by the skewed distribution of the regressors.
This framework is called visual inference (Buja, et al. 2009).
A lineup consists of the real residual plot randomly placed among null plots simulated consistently with the fitted model.
Can you now identify the real residual plot?
It is not uncommon for residual plots (No. 11) of this model to exhibit a triangle shape.
The visual discovery is calibrated via comparison.
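As an illustration of the lineup protocol (in Python; the study's own implementation is not shown here), one can hide the real residuals among null residuals. This sketch uses a parametric bootstrap from the fitted model to generate the null panels — an assumption, since other null-generation schemes such as residual rotation are also common:

```python
import numpy as np

def make_lineup(x, y, n_null=19, rng=None):
    """Return (panels, pos): a list of n_null + 1 residual vectors with the
    real residuals hidden at random position pos among simulated nulls."""
    if rng is None:
        rng = np.random.default_rng()
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    sigma = np.sqrt(resid @ resid / (len(y) - X.shape[1]))
    panels = []
    for _ in range(n_null):
        # parametric-bootstrap null: refit to data simulated under the null model
        y_null = fitted + rng.normal(0.0, sigma, size=len(y))
        b, *_ = np.linalg.lstsq(X, y_null, rcond=None)
        panels.append(y_null - X @ b)
    pos = int(rng.integers(0, n_null + 1))    # random slot for the real plot
    panels.insert(pos, resid)
    return panels, pos
```

Each panel would then be plotted against the fitted values; a viewer who picks position `pos` has detected the real plot.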
To understand why regression experts consistently recommend plotting residuals, we conducted an experiment to compare conventional hypothesis testing with visual testing in linear regression diagnostics.
\[\boldsymbol{y} = \boldsymbol{1}_n + \boldsymbol{x} + \boldsymbol{z} + \boldsymbol{\varepsilon},~ \boldsymbol{z} \propto He_j(\boldsymbol{x}) \text{ and } \boldsymbol{\varepsilon} \sim N(\boldsymbol{0}_n, \sigma^2\boldsymbol{I}_n),\]
where \(\boldsymbol{y}\), \(\boldsymbol{x}\), \(\boldsymbol{z}\), \(\boldsymbol{\varepsilon}\) are vectors of size \(n\), \(\boldsymbol{1}_n\) is a vector of ones of size \(n\), and \(He_{j}(\cdot)\) is the \(j\)th-order probabilist's Hermite polynomial.
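A minimal numpy sketch of drawing one data set from this non-linearity model (the uniform regressor distribution and the unit scaling of \(\boldsymbol{z}\) are assumptions for illustration):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def simulate_nonlinear(n=100, j=3, sigma=1.0, rng=None):
    """Draw from y = 1_n + x + z + eps, with z proportional to He_j(x)
    and eps ~ N(0, sigma^2 I)."""
    if rng is None:
        rng = np.random.default_rng()
    x = rng.uniform(-1, 1, size=n)        # assumed regressor distribution
    c = np.zeros(j + 1)
    c[j] = 1.0                            # coefficients selecting He_j
    z = hermeval(x, c)                    # probabilist's Hermite polynomial
    z = z / np.std(z)                     # fix the scale ("z proportional to He_j(x)")
    eps = rng.normal(0.0, sigma, size=n)
    y = 1.0 + x + z + eps
    return x, y
```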
\[\boldsymbol{y} = \beta_0 + \beta_1\boldsymbol{x} + \boldsymbol{u}, ~\boldsymbol{u} \sim N(\boldsymbol{0}_n, \sigma^2\boldsymbol{I}_n).\]
\[\boldsymbol{y} = \boldsymbol{1}_n + \boldsymbol{x} + \boldsymbol{\varepsilon},~ \boldsymbol{\varepsilon} \sim N(\boldsymbol{0}_n, \text{diag}(\boldsymbol{1}_n + (2 - |a|)b(\boldsymbol{x} - a)^2)),\]
where \(\boldsymbol{y}\), \(\boldsymbol{x}\), \(\boldsymbol{\varepsilon}\) are vectors of size \(n\), and \(\boldsymbol{1}_n\) is a vector of ones of size \(n\).
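A matching numpy sketch for the heteroskedasticity model above (again, the uniform regressor distribution and the default values of \(a\) and \(b\) are assumptions):

```python
import numpy as np

def simulate_heteroskedastic(n=100, a=0.0, b=4.0, rng=None):
    """Draw from y = 1_n + x + eps with
    Var(eps_i) = 1 + (2 - |a|) * b * (x_i - a)^2."""
    if rng is None:
        rng = np.random.default_rng()
    x = rng.uniform(-1, 1, size=n)        # assumed regressor distribution
    var = 1.0 + (2.0 - abs(a)) * b * (x - a) ** 2
    eps = rng.normal(0.0, np.sqrt(var))   # element-wise standard deviations
    y = 1.0 + x + eps
    return x, y
```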
\[\boldsymbol{y} = \beta_0 + \beta_1\boldsymbol{x} + \boldsymbol{u}, ~\boldsymbol{u} \sim N(\boldsymbol{0}_n, \sigma^2\boldsymbol{I}_n).\]
We have chosen to use an approach based on Kullback-Leibler divergence (Kullback and Leibler, 1951).
The effect size is defined as
\[\begin{align*} E &= \frac{1}{2}\left(\log\frac{|\text{diag}(\boldsymbol{R}\boldsymbol{V}\boldsymbol{R}')|}{|\text{diag}(\boldsymbol{R}\widehat{\sigma}^2)|} - n + \text{tr}(\text{diag}(\boldsymbol{R}\boldsymbol{V}\boldsymbol{R}')^{-1}\text{diag}(\boldsymbol{R}\widehat{\sigma}^2)) + \boldsymbol{\mu}_z'(\boldsymbol{R}\boldsymbol{V}\boldsymbol{R}')^{-1}\boldsymbol{\mu}_z\right), \\ \boldsymbol{\mu}_z &= \boldsymbol{R}\boldsymbol{Z},\\ \boldsymbol{R} &= \boldsymbol{I}_n - \boldsymbol{X}(\boldsymbol{X}'\boldsymbol{X})^{-1}\boldsymbol{X}', \end{align*}\]
where \(\text{diag}(\cdot)\) is the diagonal matrix constructed from the diagonal elements of a matrix, \(\boldsymbol{V}\) is the actual covariance matrix of the error term, and \(\boldsymbol{Z}\) is the mean deviation of the error term from the null model.
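The effect size can be computed directly from these definitions. A numpy sketch; note that \(\boldsymbol{R}\) is a rank-deficient projection, so using a pseudo-inverse of \(\boldsymbol{R}\boldsymbol{V}\boldsymbol{R}'\) in the quadratic term is an implementation assumption:

```python
import numpy as np

def effect_size(X, V, Z, sigma2_hat):
    """Kullback-Leibler effect size of a model violation.
    X: n x p design matrix, V: true error covariance matrix,
    Z: mean deviation of the error term, sigma2_hat: estimated error variance."""
    n = X.shape[0]
    R = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # residual operator
    RVR = R @ V @ R.T
    d1 = np.diag(RVR)                  # diagonal of R V R'
    d0 = np.diag(R) * sigma2_hat       # diagonal of R * sigma^2_hat
    mu_z = R @ Z
    # pseudo-inverse, since R V R' is singular -- an implementation assumption
    quad = mu_z @ np.linalg.pinv(RVR) @ mu_z
    return 0.5 * (np.sum(np.log(d1)) - np.sum(np.log(d0))
                  - n + np.sum(d0 / d1) + quad)
```

As a sanity check, when the null model is exactly true (\(\boldsymbol{V} = \sigma^2\boldsymbol{I}_n\), \(\boldsymbol{Z} = \boldsymbol{0}_n\)) the effect size is zero.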
We use logistic regression to estimate the power:
\[Pr(\text{reject}~H_0|H_1,E) = \Lambda\left(\log\left(\frac{0.05}{0.95}\right) + \beta_1 E\right),\]
where \(\Lambda(\cdot)\) is the standard logistic function, \(\Lambda(z) = \exp(z)/(1+\exp(z))\).
The effect size \(E\) is the only predictor.
The intercept is fixed at \(\log(0.05/0.95)\) so that \(\widehat{Pr}(\text{reject}~H_0|H_1,E = 0) = 0.05\).
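A sketch of fitting this one-parameter model by maximum likelihood with the fixed intercept, using scipy (the bounded search interval for \(\beta_1\) is an assumption):

```python
import numpy as np
from scipy.optimize import minimize_scalar

OFFSET = np.log(0.05 / 0.95)   # fixed intercept: power is 0.05 at E = 0

def power(E, beta1):
    """Pr(reject H0 | H1, E) under the constrained logistic model."""
    return 1.0 / (1.0 + np.exp(-(OFFSET + beta1 * np.asarray(E, dtype=float))))

def fit_power_curve(E, reject):
    """Estimate beta1 by maximizing the Bernoulli likelihood of the
    observed reject/not-reject outcomes, with the intercept held fixed."""
    E = np.asarray(E, dtype=float)
    reject = np.asarray(reject, dtype=float)

    def negloglik(beta1):
        p = np.clip(power(E, beta1), 1e-12, 1 - 1e-12)  # guard against log(0)
        return -np.sum(reject * np.log(p) + (1 - reject) * np.log(1 - p))

    res = minimize_scalar(negloglik, bounds=(0.0, 50.0), method="bounded")
    return res.x
```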
Overall, we collected 7974 evaluations of 1152 unique lineups, performed by 443 subjects recruited from a crowd-sourcing platform, Prolific (Palan and Schitter, 2018).
Every subject was asked to:
The visual test rejects less frequently than the conventional test, and (almost) only rejects when the conventional test does.
The data plot (No. 1) is indistinguishable from the other plots, with an extremely small effect size (\(\log_e(E) = -0.48\)).
The non-linearity pattern is visually undetectable.
However, the RESET test rejects with a very small \(p\)-value of \(0.004\). In contrast, the visual test produces a \(p\)-value of \(0.813\).
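For reference, a minimal numpy/scipy reimplementation of Ramsey's RESET test used in this comparison (the choice of fitted-value powers, here up to the cube, is an assumption; the study would have used an existing implementation):

```python
import numpy as np
from scipy import stats

def reset_test(x, y, max_power=3):
    """Ramsey RESET: augment y ~ 1 + x with powers of the fitted values
    (yhat^2, ..., yhat^max_power) and F-test the added terms."""
    n = len(y)
    X0 = np.column_stack([np.ones(n), x])
    b0, *_ = np.linalg.lstsq(X0, y, rcond=None)
    yhat = X0 @ b0
    rss0 = np.sum((y - yhat) ** 2)                 # restricted RSS
    X1 = np.column_stack([X0] + [yhat ** k for k in range(2, max_power + 1)])
    b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
    rss1 = np.sum((y - X1 @ b1) ** 2)              # unrestricted RSS
    q = X1.shape[1] - X0.shape[1]                  # number of added regressors
    df = n - X1.shape[1]
    F = ((rss0 - rss1) / q) / (rss1 / df)
    return F, stats.f.sf(F, q, df)
```

A strongly non-linear data set yields a tiny \(p\)-value, mirroring the sensitivity discussed above.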
Conventional tests are more sensitive to weak departures than visual tests.
Conventional tests often reject when departures are not visibly different from null residual plots.
Visual tests perform equally well across different types of residual departure and remove subjective debate about whether a pattern is visible or not.
Regression experts are right. Residual plots are an indispensable tool for assessing model fit.
Slides URL: https://patrickli-dec11talk.netlify.app